W7-W8. Pushdown Transducers, Operations on DPDA, Automata Theory and Models of Computation

Author

Manuel Mazzara

Published

March 5, 2026

1. Summary

1.1 Recap: Finite State Transducers

Before introducing the more powerful Pushdown Transducer, it is important to recall how a Finite State Transducer (FST) works, since a PDT is its direct generalization.

A Finite State Transducer is a Finite State Automaton extended with an output mechanism. Formally, an FST is a tuple \(\langle Q, I, \delta, q_0, F, O, \eta \rangle\) where:

\(\langle Q, I, \delta, q_0, F \rangle\) is a standard FSA (states, input alphabet, transition function, initial state, accepting states);
\(O\) is the output alphabet — the set of symbols that can be written to the output;
\(\eta : Q \times I \to O^*\) is the translation function — it maps each state-input pair to an output string.

Key remark: The translation \(\eta\) is only applied to strings that are accepted by the underlying FSA. If an input string is rejected, its translation is undefined.

How an FST works: As the FST reads each input symbol, it simultaneously (a) changes state (via \(\delta\)) and (b) writes an output string (via \(\eta\)). The arrow label on a transition diagram is written as input/output, for example a/A means “read \(a\) from input, write \(A\) to output.”

Example — Simple case-converter:

Consider \(L = \{x \in \{a, b\}^*\}\) (all strings over \(\{a,b\}\)) with translation \(a \mapsto A\), \(b \mapsto B\).

The FST has a single accepting state \(q_0\) with self-loops:

\(q_0 \xrightarrow{a/A} q_0\): read \(a\), output \(A\)
\(q_0 \xrightarrow{b/B} q_0\): read \(b\), output \(B\)

On input \(aba\), the FST outputs \(ABA\).

Example — FST with restricted domain:

Consider \(L = \{x \in \{a,b\}^* \mid \text{no } b\text{'s in } x\}\) with the same translation. Now the FST has two states:

\(q_0\) (accepting): reads \(a\) and outputs \(A\); if it reads \(b\), goes to the non-accepting “sink” state \(q_1\)
\(q_1\) (rejecting): reads any symbol and outputs \(\varepsilon\) (because the string will be rejected anyway)

On input \(aaa\), output is \(AAA\). On input \(aba\), the string is rejected — translation is undefined.

Limitation of FSTs: FSTs, like FSAs, have only fixed, finite memory (the current state). They cannot handle languages requiring counting, such as \(L = \{a^n b^n \mid n \geq 1\}\), where the output must track how many \(a\)’s were read to produce the right number of output symbols. We need a Pushdown Transducer.

1.2 Pushdown Transducers (PDTs)

1.2.1 Motivation and Intuition

A Pushdown Transducer (PDT) extends a Pushdown Automaton with an output tape, just as a Finite State Transducer extends an FSA. The PDT can use its stack not only to recognize the language (as in a PDA) but also to assist the translation: the stack stores intermediate information needed to generate the correct output.

The architecture of a PDT consists of four components:

Input tape: a read-only tape from which the PDT reads symbols left-to-right
Finite state control: a finite set of states with transition logic
Stack: a LIFO (last-in, first-out) memory that can grow without bound
Output tape: a write-only tape to which the PDT appends output symbols

Architecture of a Pushdown Transducer: input tape, finite control, stack, and output tape

The stack is a destructive memory: once a symbol is popped off the stack, it is gone. This is both the source of the PDT’s power and its ultimate limitation.

A transition of a PDT reads an input symbol (or \(\varepsilon\)), inspects the top of the stack, changes state, rewrites the top of the stack, and writes to the output tape — all in one atomic step.

1.2.2 Transition Notation

A transition from state \(q\) to state \(p\) is drawn on an arrow and labeled:

\[i,\, A / \alpha,\, w\]

where:

\(i \in I \cup \{\varepsilon\}\) is the input symbol consumed (or \(\varepsilon\) for a silent move)
\(A \in \Gamma\) is the stack symbol popped from the top
\(\alpha \in \Gamma^*\) is the string pushed back onto the stack (replacing \(A\))
\(w \in O^*\) is the string written to the output tape

An alternative notation uses a colon to separate the stack part from the output:

\[i,\, A / \alpha : w\]

Both notations are used in the literature and mean the same thing.

This means that transition \(\delta(q, i, A) = (p, \alpha)\) and \(\eta(q, i, A) = w\) are combined in the diagram as the label \(i, A/\alpha, w\).

1.2.3 Formal Definition

A Pushdown Transducer is a 9-tuple:

\[\text{PDT} = \langle Q,\, I,\, \Gamma,\, \delta,\, q_0,\, Z_0,\, F,\, O,\, \eta \rangle\]

where:

\(Q\) — finite set of states
\(I\) — finite input alphabet
\(\Gamma\) — finite stack alphabet
\(\delta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to \mathcal{P}(Q \times \Gamma^*)\) — the transition function (partial; maps to finite subsets)
\(q_0 \in Q\) — initial state
\(Z_0 \in \Gamma\) — initial stack symbol (bottom-of-stack marker)
\(F \subseteq Q\) — accepting states
\(O\) — finite output alphabet
\(\eta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to O^*\) — the translation function (defined only where \(\delta\) is defined)

Remarks:

\(Q, I, \Gamma, \delta, q_0, Z_0, F\) are the same components as in a standard “acceptor” PDA.
\(\eta\) is defined only where \(\delta\) is defined — there is no output on undefined transitions.
The stack can be necessary for two distinct reasons: (a) the language being recognized requires it, and/or (b) the translation requires it.

1.2.4 Configurations and Transitions

A configuration of a PDT is a 4-tuple:

\[c = \langle q,\, x,\, \gamma,\, z \rangle\]

where:

\(q \in Q\) — the current state of the finite control
\(x \in I^*\) — the unread portion of the input string
\(\gamma \in \Gamma^*\) — the current contents of the stack (top symbol first)
\(z \in O^*\) — the string already written on the output tape

Transitions between configurations:

Input move (consuming symbol \(i\)): If \(\delta(q, i, A) = (q', \alpha)\) and \(\eta(q, i, A) = w\), then: \[\langle q,\, iy,\, A\gamma,\, z \rangle \vdash \langle q',\, y,\, \alpha\gamma,\, zw \rangle\]
\(\varepsilon\)-move (silent, no input consumed): If \(\delta(q, \varepsilon, A) = (q', \alpha)\) and \(\eta(q, \varepsilon, A) = w\), then: \[\langle q,\, x,\, A\gamma,\, z \rangle \vdash \langle q',\, x,\, \alpha\gamma,\, zw \rangle\]

Note the key difference: in an \(\varepsilon\)-move, the unread input \(x\) is unchanged (the PDT consumes no input symbol).

1.2.5 Acceptance Condition

The PDT translates a string \(x\) to \(z = \tau(x)\) if and only if starting from the initial configuration \(c_0 = \langle q_0, x, Z_0, \varepsilon \rangle\), the PDT reaches a final configuration \(c_F = \langle q_f, \varepsilon, \gamma, z \rangle\) with \(q_f \in F\) (all input consumed, in an accepting state):

\[\forall x \in I^*:\quad x \in L \wedge z = \tau(x) \iff \langle q_0, x, Z_0, \varepsilon \rangle \vdash^* \langle q_f, \varepsilon, \gamma, z \rangle \text{ for some } q_f \in F\]

This is defined analogously to how translation works in an FST: the translation of \(x\) is defined only if the string \(x\) is accepted.

1.3 PDT Examples

1.3.1 Translating \(a^n b^n \to A^n B^n\)

Consider \(L = \{a^n b^n \mid n \geq 1\}\) with translation \(\tau(a^n b^n) = A^n B^n\) (capitalize each symbol).

Strategy: Use the stack to count the \(a\)’s (push \(A\) for each \(a\)), then verify the \(b\)’s while outputting \(B\) for each matched \(A\).

The PDT has states \(q_0, q_1, q_2, q_3\) (accepting):

\(q_0 \to q_1\): transition \(a, Z_0 / AZ_0, A\) — read first \(a\), push \(A\) onto \(Z_0\), output \(A\)
\(q_1 \circlearrowleft\): transition \(a, A / AA, A\) — read each additional \(a\), push another \(A\), output \(A\)
\(q_1 \to q_2\): transition \(b, A / \varepsilon, B\) — start reading \(b\)’s, pop one \(A\), output \(B\)
\(q_2 \circlearrowleft\): transition \(b, A / \varepsilon, B\) — each additional \(b\), pop one \(A\), output \(B\)
\(q_2 \to q_3\): transition \(\varepsilon, Z_0 / Z_0, \varepsilon\) — if \(Z_0\) is on top after consuming all input, accept (no output)

PDT for translating aⁿbⁿ → AⁿBⁿ

On input \(aba\): rejected (not in \(L\)), translation undefined.

1.3.2 Translating \(a^n b^n \to b^n\)

Consider \(L = \{a^n b^n \mid n > 0\}\) with translation \(\tau(a^n b^n) = b^n\) (output only the \(b\)-part).

Strategy: During the push phase (reading \(a\)’s), output nothing. Only during the pop phase (reading \(b\)’s), output \(b\) for each matched \(A\).

\(q_0\): \(a, Z_0 / AZ_0, \varepsilon\) and \(a, A / AA, \varepsilon\) — push \(A\)’s with no output
\(q_0 \to q_1\): \(b, A / \varepsilon, b\) — first \(b\), pop one \(A\), output \(b\)
\(q_1 \circlearrowleft\): \(b, A / \varepsilon, b\) — each additional \(b\), output \(b\)
\(q_1 \to q_F\): \(\varepsilon, Z_0 / Z_0, \varepsilon\) — accept

Result: \(a^n b^n \mapsto b^n\).

1.3.3 Translating \(a^n b^n \to b^n a^n\)

Consider \(L = \{a^n b^n \mid n > 0\}\) with translation \(\tau(a^n b^n) = b^n a^n\) (swap the blocks).

Strategy: During the push phase (reading \(a\)’s), output \(b\) (anticipate the output). During the pop phase (reading \(b\)’s), output \(a\) for each matched symbol.

\(q_0\): \(a, Z_0 / AZ_0, b\) and \(a, A / AA, b\) — push \(A\)’s, output \(b\) for each
\(q_0 \to q_1\): \(b, A / \varepsilon, a\) — pop \(A\), output \(a\)
\(q_1 \circlearrowleft\): \(b, A / \varepsilon, a\) — each additional \(b\), output \(a\)
\(q_1 \to q_F\): \(\varepsilon, Z_0 / Z_0, \varepsilon\) — accept

Result: \(a^n b^n \mapsto b^n a^n\). This example highlights that the stack enables “inverted” translations not possible with an FST.

1.3.4 Translating \(wc \to w^R\) (Reverse a String)

Consider \(L = \{wc \mid w \in \{a,b\}^+\}\) with translation \(\tau(wc) = w^R\) (reverse of \(w\), where \(c\) is a separator).

Strategy (PUSH-THEN-POP): While reading \(w\), push each symbol onto the stack and output nothing. When the separator \(c\) is encountered, stop pushing and start popping — outputting each symbol as it is popped. Since the stack is LIFO, popping gives symbols in reverse order.

State \(q_0\) (PUSH TIME — NO OUTPUT):
- \(a, Z_0 / AZ_0, \varepsilon\); \(a, A / AA, \varepsilon\); \(a, B / AB, \varepsilon\)
- \(b, Z_0 / BZ_0, \varepsilon\); \(b, B / BB, \varepsilon\); \(b, A / BA, \varepsilon\)
Transition \(q_0 \to q_1\) (STOP PUSH):
- \(c, A / \varepsilon, a\); \(c, B / \varepsilon, b\)
State \(q_1\) (POP TIME — OUTPUT GENERATED):
- \(\varepsilon, A / \varepsilon, a\); \(\varepsilon, B / \varepsilon, b\)
Transition \(q_1 \to q_2\) (accepting): \(\varepsilon, Z_0 / Z_0, \varepsilon\)

PDT for reversing wc → wᴿ

On input \(abc\): push \(A\), push \(B\), then see \(c\) — pop \(B\) (output \(b\)), pop \(A\) (output \(a\)). Final output: \(ba = (ab)^R\). ✓

This example illustrates a key property: the stack enables translations that require “remembering” the input in order to produce output in a non-left-to-right order.

1.4 Closure Properties of DPDA Languages

We now turn from transducers to a fundamental question in formal language theory: which operations preserve the class of languages recognized by Deterministic Pushdown Automata (DPDAs)?

This is the question of closure properties. A class of languages \(\mathcal{C}\) is closed under an operation \(\mathcal{O}\) if applying \(\mathcal{O}\) to languages in \(\mathcal{C}\) always produces a language that is also in \(\mathcal{C}\).

Formally: \(\mathcal{C}\) is closed under \(\mathcal{O}\) if for any \(L_1, \ldots, L_n \in \mathcal{C}\), we have \(\mathcal{O}(L_1, \ldots, L_n) \in \mathcal{C}\).

Motivation: For regular languages (recognized by FSAs), the class is closed under all four standard operations — union, intersection, complement, and difference. We want to know whether deterministic context-free languages (DPDAs) enjoy the same nice properties.

Closure properties: FSA vs DPDA

Spoiler (summary table):

	\(\cup\)	\(\cap\)	\(\setminus\)	\({}^c\)
FSA (Regular)	yes	yes	yes	yes
DPDA (Det. CFL)	no	no	no	yes

We now examine each operation in turn and understand why the result is what it is.

1.4.1 Closure Under Union: NO

Claim: DPDA is not closed under union.

Counterexample: Consider the following two languages over \(\Sigma = \{a, b\}\):

\[L_1 = \{a^n b^n \mid n \geq 1\}, \quad L_2 = \{a^n b^{2n} \mid n \geq 1\}\]

Both \(L_1\) and \(L_2\) are individually recognizable by DPDAs (each is a deterministic context-free language). However, their union:

\[L_1 \cup L_2 = \{a^n b^n \mid n \geq 1\} \cup \{a^n b^{2n} \mid n \geq 1\}\]

cannot be recognized by any DPDA.

Why? The intuitive reason is the stack preparation problem. When a DPDA reads the \(a\)’s, it must decide in advance — during the push phase — whether to prepare the stack to match \(n\) \(b\)’s (for \(L_1\)) or \(2n\) \(b\)’s (for \(L_2\)). But this decision can only be made after reading all the \(a\)’s and then seeing the \(b\)’s. By that point, the stack content is already fixed. A nondeterministic PDA could explore both possibilities in parallel, but a deterministic one cannot.

More precisely: after reading \(a^n\), the DPDA’s stack contains some encoding of \(n\). When it starts reading \(b\)’s, it cannot simultaneously try to match them against \(n\) (for \(L_1\)) and \(2n\) (for \(L_2\)) — it must commit to one interpretation. If it commits to \(L_1\) and the string belongs to \(L_2\) (or vice versa), it fails.

Contrast with regular languages: For FSAs, taking the union is easy — build the cross-product automaton and accept if either component accepts. For DPDAs, no analogous construction works in general because the stack contents for \(L_1\) and \(L_2\) may be incompatible.

\(\Rightarrow\) DPDA is NOT closed under \(\cup\).

1.4.2 Closure Under Intersection: NO

Claim: DPDA is not closed under intersection.

Counterexample: Consider languages over \(\Sigma = \{a, b, c\}\):

\[L_1 = \{a^n b^n c^m \mid n, m > 0\}, \quad L_2 = \{a^m b^n c^n \mid n, m > 0\}\]

Both are recognizable by DPDAs:

\(L_1\): count \(a\)’s, match \(b\)’s one-for-one, then accept any positive number of \(c\)’s.
\(L_2\): skip any positive number of \(a\)’s, then count \(b\)’s, match \(c\)’s one-for-one.

Their intersection:

\[L_1 \cap L_2 = \{a^n b^n c^n \mid n > 0\}\]

This is the classic language \(\{a^n b^n c^n\}\), which is not recognized by any PDA (not even a nondeterministic one), as provable via the Bar-Hillel (Pumping) Lemma for context-free languages. Therefore it certainly cannot be recognized by a DPDA.

\(\Rightarrow\) DPDA is NOT closed under \(\cap\).

Proof via complement: There is also an algebraic way to see this, using De Morgan’s law:

\[A \cup B = (A^c \cap B^c)^c\]

Since DPDA is closed under complement (see below), if it were also closed under intersection, it would be closed under union — but we just showed it is not. Contradiction.

1.4.3 Closure Under Complement: YES

Claim: DPDA is closed under complement.

Theorem: If \(L \in \mathbf{DPDA}\), then \(L^c \in \mathbf{DPDA}\).

Why complement works for DPDA but not NPDA: For nondeterministic PDAs, complementation fails because a string might be rejected on all computation paths but accepted on some — the nondeterminism makes it hard to “flip” the acceptance condition. For deterministic PDAs, each input leads to exactly one computation, so we can cleanly distinguish accepted from rejected strings.

Construction sketch — why simply swapping final states does not work:

For FSAs, complementing is simple: swap final and non-final states. For PDAs, this naive approach fails because of \(\varepsilon\)-moves. Consider: the PDA with input \(x\) ends in a non-final state \(q\), but could reach a final state \(q'\) via an \(\varepsilon\)-move. If we flip states, \(q\) becomes final, so the complemented PDA also accepts \(x\) — meaning both the original and its “complement” accept \(x\). This is wrong.

Correct construction — three steps:

Eliminate loops: Ensure the DPDA is acyclic — it always reads the entire input string and then stops, with no \(\varepsilon\)-moves possible after the last input symbol is consumed. Every DPDA can be transformed into an equivalent acyclic DPDA (this is a non-trivial but provable fact). This prevents the \(\varepsilon\)-move problem above.
Complete \(\delta\): Make the transition function total by adding a non-accepting “garbage” state \(q_{\text{err}}\) for all undefined transitions. After this, the DPDA is a total function from configurations to configurations.
Swap final and non-final states: Now that the machine always halts on a well-defined state after reading the entire input (with no \(\varepsilon\)-move ambiguity), swapping accepting and non-accepting states correctly complements the language.

Why acyclicity matters: Without eliminating \(\varepsilon\)-loops, after reading the entire input string, the DPDA might loop forever through \(\varepsilon\)-moves, passing through some final and some non-final states. With acyclicity, once all input is consumed, the DPDA is in a single well-defined state, and swapping acceptance correctly inverts the language.

\(\Rightarrow\) DPDA IS closed under \({}^c\).

1.4.4 Closure Under Difference: NO

Claim: DPDA is not closed under difference (set subtraction).

Proof by contradiction: Suppose DPDA were closed under \(\setminus\). We know that for any sets \(A, B\):

\[A \cap B = A \setminus B^c\]

Since DPDA is closed under complement, \(B^c \in \mathbf{DPDA}\) whenever \(B \in \mathbf{DPDA}\). If DPDA were also closed under \(\setminus\), then \(A \setminus B^c \in \mathbf{DPDA}\), which means \(A \cap B \in \mathbf{DPDA}\). But we proved DPDA is not closed under intersection. Contradiction!

Therefore DPDA is not closed under \(\setminus\).

\(\Rightarrow\) DPDA is NOT closed under \(\setminus\).

Comparison with regular languages: Regular languages (FSA) are closed under all four operations. Deterministic context-free languages have a more asymmetric closure profile — they are only closed under complement, but not under union, intersection, or difference.

1.5 The Bar-Hillel Lemma (Pumping Lemma for CFLs)

The Bar-Hillel Lemma is the generalization of the Pumping Lemma from regular languages to context-free languages. Just as the Pumping Lemma for regular languages gives a necessary condition for regularity, Bar-Hillel gives a necessary (but not sufficient) condition for being a context-free language.

Bar-Hillel Lemma: If \(L \subseteq \Sigma^*\) is a context-free language (recognized by some NPDA), then there exists \(m \geq 1\) such that every \(w \in L\) with \(|w| \geq m\) can be written as:

\[w = x_1 x_2 x_3 x_4 x_5\]

satisfying:

\(|x_2 x_4| > 0\) (the “pumped” parts are not both empty)
\(|x_2 x_3 x_4| \leq m\) (the pumped region is bounded)
\(x_1 x_2^i x_3 x_4^i x_5 \in L\) for all \(i \geq 1\) (pumping any number of times stays in \(L\))

Note: unlike the regular Pumping Lemma (which pumps one substring \(y\)), Bar-Hillel pumps two substrings simultaneously — \(x_2\) and \(x_4\) — always by the same amount \(i\). This reflects the nested structure of context-free derivations.

Application — proving \(\{a^n b^n c^n \mid n > 0\}\) is not a CFL:

The language \(L = \{a^n b^n c^n \mid n > 0\}\) is not recognized by any PDA (not even nondeterministic). This can be proved using the Bar-Hillel Lemma:

Assume for contradiction that \(L\) is a CFL. Let \(m\) be the Bar-Hillel constant. Choose \(w = a^m b^m c^m \in L\). For any decomposition \(w = x_1 x_2 x_3 x_4 x_5\) with \(|x_2 x_4| > 0\) and \(|x_2 x_3 x_4| \leq m\):

The region \(x_2 x_3 x_4\) is short (length \(\leq m\)), so it can span at most two of the three blocks (\(a\)’s, \(b\)’s, \(c\)’s).
Pumping up (\(i = 2\)) increases the counts of at most two symbol types, but not all three equally.
Therefore \(x_1 x_2^2 x_3 x_4^2 x_5\) has unequal counts of \(a\)’s, \(b\)’s, and \(c\)’s — it is not in \(L\).

Contradiction. So \(L = \{a^n b^n c^n\}\) is not context-free.

1.6 The Stack as Destructive Memory and Limits of PDAs

1.6.1 Destructive vs. Persistent Memory

A key insight about PDAs is that the stack is a destructive memory: once a symbol is popped off the stack, it is irrecoverably gone. This is precisely why:

A PDA can recognize \(\{a^n b^n\}\): push \(n\) symbols while reading \(a\)’s, then pop one per \(b\).
A PDA cannot recognize \(\{a^n b^n c^n\}\): to check the \(c\)’s, it needs to know \(n\) again — but the stack was already emptied verifying the \(b\)’s.

To overcome this limitation, one needs persistent (read/write) memory — a device that can read a memory location without destroying it. This is the motivation for Turing Machines, which use an unbounded read/write tape instead of a stack.

1.6.2 The Language Hierarchy

Language hierarchy: Regular languages ⊂ Context-Free ⊂ Recursively Enumerable

The hierarchy of languages recognized by different computational models is:

Regular languages ⊂ Context-free languages (CFLs) ⊂ Recursively enumerable languages (all TM-computable languages)

Examples that illustrate the boundaries:

\(\{a^n b^n \mid n \geq 1\}\): not regular (proved by Pumping Lemma), but context-free (recognized by a PDA)
\(\{a^n b^n c^n \mid n > 0\}\): not context-free (proved by Bar-Hillel), but computable by a Turing Machine
\(\{a^n b^n \mid n \geq 1\} \cup \{a^n b^{2n} \mid n \geq 1\}\): context-free (recognized by an NPDA) but not a deterministic context-free language

1.7 Why PDAs Are at the Heart of Compilers

PDAs are not merely theoretical constructs — they have direct, practical applications in software engineering, particularly in compiler construction.

1.7.1 PDA and Compilers

The stack memory in a PDA has a LIFO (Last-In, First-Out) policy. This is precisely the right structure for analyzing nested syntactic structures, such as:

Arithmetic expressions with parentheses: (a + (b * c))
Block-structured programs with begin/end or {/}
Function call chains with activation records (each call pushes a frame, each return pops it)

The general structure of a compiler proceeds through several phases:

Lexical Analysis: recognizes tokens (identifiers, keywords, operators) — modeled by FSAs
Syntax Analysis (Parsing): recognizes the grammatical structure of programs — modeled by NPDAs (context-free grammars describe most programming language syntax)
Semantic Analysis, Code Generation, Optimization: further processing

This is why the syntax of almost all programming languages is defined by a context-free grammar, and why parsers (the software that performs syntax analysis) are essentially pushdown automata.

1.7.2 Fixed vs. Unbounded Memory

A crucial conceptual distinction separates FSAs from PDAs:

FSA: fixed (bounded) memory — the machine has a finite number of states, period. It cannot count to an arbitrary \(n\).
PDA: finite but unbounded memory — the machine has a finite set of states, but the stack can grow without bound. It can count to any \(n\) by pushing \(n\) symbols.

For example, \(L = \{a^n b^n\}\) requires counting an arbitrary \(n\). An FSA with any fixed number of states \(k\) will fail for \(n > k\). A PDA, by using its unbounded stack, succeeds for all \(n\).

1.8 Automata Theory and Models of Computation

1.8.1 What is Automata Theory?

Automata theory is the branch of theoretical computer science that studies:

Abstract mathematical machines (automata) and their computational properties
The computational problems that can be solved by various types of automata

The word automaton (plural: automata) comes from the Greek αὐτόματον, meaning “self-moving” — something that acts by itself. The word was first used by Homer (describing automatic door opening and self-moving statues in ancient Greek epic poetry).

1.8.2 Why Study Automata Theory?

An automaton is a finite representation of a formal language that may be infinite — a finite machine can describe infinitely many strings.
Automata serve as theoretical models for computing machines, enabling rigorous proofs about computability and complexity.
They have direct practical applications: compiler design, model checking, protocol verification, pattern matching.

1.8.3 Models of Computation

A model of computation describes how outputs are computed from inputs, and how computational units, memories, and communications are organized. Different models capture different levels of computational power:

Sequential models:

Finite State Automata (FSA) — limited expressiveness, fixed memory; suitable for pattern matching and “brute-force” analysis (model checking)
Pushdown Automata (PDA) — adds a stack; can handle nested structures; basis for parsing
Turing Machine — unlimited read/write tape; the most general model of sequential computation

Functional models:

Lambda calculus — computation via function application and reduction

Concurrent models:

Petri nets — for modeling concurrent, distributed systems

This list is not exhaustive — there are many other models (register machines, cellular automata, etc.).

1.8.4 Applications of FSAs in Practice

FSAs (Finite State Automata) have wide practical applications:

Moore/Mealy machines model digital circuits and electronic devices
- Mealy machines are essentially finite state transducers (FSTs) — they produce output on transitions
Lexical analysis in compilers: FSAs recognize tokens (the “words” of a programming language)
UML state machines: the Unified Modeling Language uses FSA-based notation for describing system behavior (e.g., a temperature controller with states Idle, Heating, Cooling)
Protocol modeling: turnstile gates, vending machines, communication protocols can all be modeled as FSAs

The key limitation of FSAs is their fixed memory: they are suitable for patterns that do not require counting or remembering unbounded amounts of data.

2. Definitions

Finite State Transducer (FST): A finite state automaton extended with an output mechanism. A 7-tuple \(\langle Q, I, \delta, q_0, F, O, \eta \rangle\) where \(O\) is the output alphabet and \(\eta: Q \times I \to O^*\) is the translation function. Translation is defined only on accepted strings.
Pushdown Transducer (PDT): A pushdown automaton extended with an output tape. A 9-tuple \(\langle Q, I, \Gamma, \delta, q_0, Z_0, F, O, \eta \rangle\) where \(O\) is the output alphabet and \(\eta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to O^*\) is the translation function. The stack can store information needed both for recognition and translation.
PDT Configuration: A 4-tuple \(\langle q, x, \gamma, z \rangle\) where \(q\) is the current state, \(x\) is the unread input, \(\gamma\) is the stack contents (top first), and \(z\) is the output already written.
Translation Function (\(\eta\)): The component of a PDT (or FST) that specifies what output to write on each transition. \(\eta(q, i, A) = w\) means: in state \(q\), reading \(i\) with stack top \(A\), write \(w\) to the output tape. Defined only where \(\delta\) is defined.
PDT Transition Label: Written as \(i, A/\alpha, w\) (or \(i, A/\alpha : w\)) on a diagram arrow, meaning: read \(i\) from input, pop \(A\), push \(\alpha\), write \(w\) to output.
Closure Under an Operation: A class of languages \(\mathcal{C}\) is closed under operation \(\mathcal{O}\) if for any \(L_1, \ldots, L_n \in \mathcal{C}\), the result \(\mathcal{O}(L_1, \ldots, L_n) \in \mathcal{C}\). Closed means: applying the operation stays within the class.
DPDA (Deterministic Pushdown Automaton): A PDA where each configuration has at most one possible next configuration: \(|\delta(q, a, A)| \leq 1\) for all \(q, a, A\), and \(\delta(q, a, A) \neq \emptyset\) implies \(\delta(q, \varepsilon, A) = \emptyset\).
Deterministic Context-Free Language (DCFL): A language recognized by some DPDA. The class of DCFLs is strictly contained within the class of CFLs.
Acyclic PDA: A PDA in which, for every input string \(x\), the computation \(\langle q_0, x, Z_0 \rangle \vdash^* \langle q, \varepsilon, \gamma \rangle\) always terminates — i.e., the PDA always reads the entire input and stops, with no \(\varepsilon\)-moves possible after all input is consumed. Every DPDA can be transformed into an equivalent acyclic DPDA.
Destructive Memory: The characteristic of a stack that once a symbol is popped, it is irrecoverably lost. Contrast with a read/write tape (Turing Machine) where symbols can be read without destroying them.
Bar-Hillel Lemma (Pumping Lemma for CFLs): If \(L\) is a context-free language, then \(\exists m \geq 1\) such that any \(w \in L\) with \(|w| \geq m\) decomposes as \(w = x_1 x_2 x_3 x_4 x_5\) with \(|x_2 x_4| > 0\), \(|x_2 x_3 x_4| \leq m\), and \(x_1 x_2^i x_3 x_4^i x_5 \in L\) for all \(i \geq 1\).
Automata Theory: The branch of theoretical computer science studying abstract mathematical machines (automata) and the computational problems solvable by them.
Model of Computation: A mathematical framework describing how outputs are computed from inputs, and how memory and computation are organized. Examples: FSAs, PDAs, Turing Machines, Lambda calculus, Petri nets.
Turing Machine (TM): A model of computation with an infinite read/write tape. More powerful than PDAs because its memory is persistent — reading a cell does not erase it. The most general model of sequential computation.
Context-Free Language (CFL): A language recognized by some (nondeterministic) pushdown automaton, or equivalently generated by a context-free grammar. The class of CFLs properly contains all regular languages.

3. Formulas

PDT formal definition: \(\text{PDT} = \langle Q, I, \Gamma, \delta, q_0, Z_0, F, O, \eta \rangle\) where \(\delta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to \mathcal{P}(Q \times \Gamma^*)\) and \(\eta: Q \times (I \cup \{\varepsilon\}) \times \Gamma \to O^*\)
PDT transition (input move): \(\langle q, iy, A\gamma, z \rangle \vdash \langle q', y, \alpha\gamma, zw \rangle\) when \(\delta(q,i,A) = (q',\alpha)\) and \(\eta(q,i,A) = w\)
PDT transition (\(\varepsilon\)-move): \(\langle q, x, A\gamma, z \rangle \vdash \langle q', x, \alpha\gamma, zw \rangle\) when \(\delta(q,\varepsilon,A) = (q',\alpha)\) and \(\eta(q,\varepsilon,A) = w\)
PDT acceptance condition: \(\tau(x) = z \iff \langle q_0, x, Z_0, \varepsilon \rangle \vdash^* \langle q_f, \varepsilon, \gamma, z \rangle\) for some \(q_f \in F\)
Bar-Hillel Lemma: For CFL \(L\), \(\exists m \geq 1\) such that \(\forall w \in L\) with \(|w| \geq m\), \(\exists\) decomposition \(w = x_1 x_2 x_3 x_4 x_5\) with \(|x_2 x_4| > 0\), \(|x_2 x_3 x_4| \leq m\), and \(\forall i \geq 1: x_1 x_2^i x_3 x_4^i x_5 \in L\)
Difference via complement: \(A \cap B = A \setminus B^c\) (used to prove non-closure of DPDA under \(\setminus\) from non-closure under \(\cap\))
De Morgan (set version): \(A \cup B = (A^c \cap B^c)^c\) (used to derive closure/non-closure results)
Closure summary for DPDA: closed under \({}^c\) only; NOT closed under \(\cup\), \(\cap\), or \(\setminus\)

4. Examples

4.1. Build a DPDT Translating \(L_1 = \{a^n b^m \mid n \geq 1 \wedge n \leq m\}\) to \(a^n b^n\) (Lab 7, Task 1)

Build a Deterministic Pushdown Transducer that accepts \(x \in L_1\) where \(L_1 = \{a^n b^m \mid n \geq 1 \wedge n \leq m\}\), and translates it into \(a^n b^n\) (keep only \(n\) \(a\)’s and \(n\) \(b\)’s, discarding the excess \(b\)’s).

Click to see the solution

Key Concept: Push one \(A\) per \(a\) (and output \(a\)). During the \(b\)-phase, pop one \(A\) per \(b\) and output \(b\) — but once the stack is empty (all \(A\)’s matched), continue reading remaining \(b\)’s without outputting anything.

DPDT for translating aⁿbᵐ with n ≤ m into aⁿbⁿ

States: \(q_0\) (reading \(a\)’s), \(q_1\) (reading \(b\)’s, stack not yet empty), \(q_2\) (reading excess \(b\)’s after stack is empty — accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, a\)	Push \(A\), output \(a\)
\(q_0\)	\(q_0\)	\(a, A/AA, a\)	Push \(A\), output \(a\)
\(q_0\)	\(q_1\)	\(b, A/\varepsilon, b\)	First \(b\): pop \(A\), output \(b\)
\(q_1\)	\(q_1\)	\(b, A/\varepsilon, b\)	More \(b\)’s: pop \(A\), output \(b\)
\(q_1\)	\(q_2\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Stack empty: all \(a\)’s matched, enter excess phase
\(q_2\)	\(q_2\)	\(b, Z_0/Z_0, \varepsilon\)	Excess \(b\)’s: discard (no output)

Acceptance: \(q_2\) is accepting (after \(\varepsilon\)-move when stack holds only \(Z_0\)). Also need \(n \geq 1\), which is ensured because we must push at least one \(A\) (no path to \(q_2\) without going through \(q_0\)’s push phase).
Trace for \(a^2 b^3\) (i.e., \(n=2, m=3\)):
- Read \(a\): push \(A\), output \(a\)
- Read \(a\): push \(A\), output \(a\)
- Read \(b\): pop \(A\), output \(b\)
- Read \(b\): pop \(A\), output \(b\)
- \(\varepsilon\)-move to \(q_2\) (stack has \(Z_0\))
- Read \(b\): discard, no output
- End: \(q_2\) accepting. Output: \(aabb = a^2 b^2\). ✓

Answer: \(\tau(a^n b^m) = a^n b^n\) for \(m \geq n \geq 1\). Output \(a\)’s and the first \(n\) \(b\)’s; discard remaining \(b\)’s.

4.2. Build a DPDT Reversing Strings in \(L_2 = \{xc \mid x \in \{a,b\}^+\}\) (Lab 7, Task 2)

Build a Deterministic Pushdown Transducer that recognizes \(L_2 = \{xc \mid x \in \{a,b\}^+\}\) and translates \(xc\) into \(x^R\) (the reverse of \(x\)).

Click to see the solution

Key Concept: Use the stack’s LIFO property for reversal. Push every character of \(x\) without outputting anything. When \(c\) is seen, stop pushing and start popping — outputting each symbol as it is popped (which gives the reverse order).

DPDT that reads xc and outputs xᴿ

States: \(q_0\) (push phase — reading \(x\)), \(q_1\) (pop phase — reading \(c\) and producing output), \(q_2\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, \varepsilon\)	Push \(A\), no output
\(q_0\)	\(q_0\)	\(a, A/AA, \varepsilon\)	Push \(A\), no output
\(q_0\)	\(q_0\)	\(a, B/AB, \varepsilon\)	Push \(A\) below \(B\), no output
\(q_0\)	\(q_0\)	\(b, Z_0/BZ_0, \varepsilon\)	Push \(B\), no output
\(q_0\)	\(q_0\)	\(b, B/BB, \varepsilon\)	Push \(B\), no output
\(q_0\)	\(q_0\)	\(b, A/BA, \varepsilon\)	Push \(B\) below \(A\), no output
\(q_0\)	\(q_1\)	\(c, A/\varepsilon, a\)	See \(c\): pop \(A\), output \(a\)
\(q_0\)	\(q_1\)	\(c, B/\varepsilon, b\)	See \(c\): pop \(B\), output \(b\)
\(q_1\)	\(q_1\)	\(\varepsilon, A/\varepsilon, a\)	Pop \(A\), output \(a\)
\(q_1\)	\(q_1\)	\(\varepsilon, B/\varepsilon, b\)	Pop \(B\), output \(b\)
\(q_1\)	\(q_2\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Stack empty: accept

Trace for \(abc\) (\(x = ab\), \(x^R = ba\)):
- Push \(A\) (no output), push \(B\) (no output)
- Read \(c\): pop \(B\), output \(b\)
- \(\varepsilon\)-move: pop \(A\), output \(a\)
- \(\varepsilon\)-move: see \(Z_0\) → accept. Output: \(ba\). ✓

Answer: \(\tau(xc) = x^R\). Push \(x\) to stack during push phase; pop during pop phase to output \(x^R\).

4.3. Build a DPDT Translating \(L_1 = \{a^n b^m a^n \mid n \geq 1, m \geq 1\}\) to \(a^n b^m\) (Lab 7, Task 3)

Build a DPDT that accepts \(L_1 = \{a^n b^m a^n \mid n \geq 1 \text{ and } m \geq 1\}\) and translates it into \(a^n b^m\) (drops the trailing \(a^n\)).

Click to see the solution

Key Concept: Output \(a\) for each leading \(a\) and push a counter. Output \(b\) for each \(b\). Once \(b\)’s are done, pop the stack for each trailing \(a\) (no output) — if counts match, accept.

DPDT for aⁿbᵐaⁿ → aⁿbᵐ

States: \(q_0\) (reading leading \(a^n\)), \(q_1\) (reading \(b^m\)), \(q_2\) (reading trailing \(a^n\)), \(q_3\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, a\)	Leading \(a\): push \(A\), output \(a\)
\(q_0\)	\(q_0\)	\(a, A/AA, a\)	More leading \(a\)’s: push \(A\), output \(a\)
\(q_0\)	\(q_1\)	\(b, A/A, b\)	First \(b\): keep stack, output \(b\)
\(q_1\)	\(q_1\)	\(b, A/A, b\)	More \(b\)’s: output \(b\), don’t touch stack
\(q_1\)	\(q_2\)	\(a, A/\varepsilon, \varepsilon\)	First trailing \(a\): pop \(A\), no output
\(q_2\)	\(q_2\)	\(a, A/\varepsilon, \varepsilon\)	More trailing \(a\)’s: pop \(A\), no output
\(q_2\)	\(q_3\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Stack empty: all trailing \(a\)’s matched → accept

Trace for \(a^2 b^3 a^2\):
- Read \(a\): push \(A\), output \(a\); read \(a\): push \(A\), output \(a\)
- Read \(b\), \(b\), \(b\): output \(b\), \(b\), \(b\) (stack unchanged: \(A, A, Z_0\))
- Read \(a\): pop \(A\), no output; read \(a\): pop \(A\), no output
- \(\varepsilon\)-move: see \(Z_0\) → \(q_3\), accept. Output: \(aabbb = a^2 b^3\). ✓

Answer: \(\tau(a^n b^m a^n) = a^n b^m\). Output the leading \(a\)’s and \(b\)’s; silently consume trailing \(a\)’s while verifying the count.

4.4. Build a DPDT Translating \(L_2 = \{a^i b^j c^k \mid i+k = j\}\) to \(a^i b^i c^k\) (Lab 7, Task 4)

Build a DPDT that accepts \(L_2 = \{a^i b^j c^k \mid i, j, k \in \mathbb{N}^+, i + k = j\}\) and translates it into \(a^i b^i c^k\) (keep only the first \(i\) \(b\)’s, discarding the extra \(k\) \(b\)’s that were “paid for” by \(c\)’s).

Click to see the solution

Key Concept: The condition \(i + k = j\) means we need \(j = i + k\) \(b\)’s. We output \(a\)’s, then match exactly \(i\) \(b\)’s against the pushed \(A\)’s (outputting \(b\) for each), then silently consume the remaining \(k\) \(b\)’s, then verify \(k\) \(c\)’s.

DPDT for aⁱbʲcᵏ with i + k = j, translating to aⁱbⁱcᵏ

Strategy:
- Push \(A\) for each \(a\) (output \(a\)).
- Match \(b\)’s against \(A\)’s (pop \(A\), output \(b\)) — after \(i\) \(b\)’s, stack holds \(Z_0\).
- Continue reading \(b\)’s while pushing \(B\)’s (no output) — these are the “extra” \(b\)’s.
- Match \(c\)’s against \(B\)’s (pop \(B\), no output) — output \(c\) for each.
- Accept when both \(A\)’s and \(B\)’s are matched.
States: \(q_0\) (reading \(a\)’s), \(q_1\) (reading \(b\)’s, matching \(A\)’s), \(q_2\) (reading extra \(b\)’s, pushing \(B\)’s), \(q_3\) (reading \(c\)’s, matching \(B\)’s), \(q_4\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, a\)	Push \(A\), output \(a\)
\(q_0\)	\(q_0\)	\(a, A/AA, a\)	Push \(A\), output \(a\)
\(q_0\)	\(q_1\)	\(b, A/\varepsilon, b\)	First \(b\): pop \(A\), output \(b\)
\(q_1\)	\(q_1\)	\(b, A/\varepsilon, b\)	More \(b\)’s vs \(A\)’s: pop, output \(b\)
\(q_1\)	\(q_2\)	\(b, Z_0/BZ_0, \varepsilon\)	Stack empty (\(i\) \(b\)’s read): start extra \(b\)’s
\(q_2\)	\(q_2\)	\(b, B/BB, \varepsilon\)	More extra \(b\)’s: push \(B\), no output
\(q_2\)	\(q_3\)	\(c, B/\varepsilon, c\)	First \(c\): pop \(B\), output \(c\)
\(q_3\)	\(q_3\)	\(c, B/\varepsilon, c\)	More \(c\)’s: pop \(B\), output \(c\)
\(q_3\)	\(q_4\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	All matched: accept

Trace for \(a^2 b^4 c^2\) (\(i=2, j=4, k=2\), and \(i+k = 4 = j\) ✓):
- Push \(A, A\), output \(a, a\)
- Pop \(A\), output \(b\); pop \(A\), output \(b\) (2 \(b\)’s matched \(i=2\) \(a\)’s)
- Push \(B\), push \(B\) (2 extra \(b\)’s)
- Pop \(B\), output \(c\); pop \(B\), output \(c\) (2 \(c\)’s)
- See \(Z_0\) → accept. Output: \(aabbcc = a^2 b^2 c^2 = a^i b^i c^k\). ✓

Answer: \(\tau(a^i b^{i+k} c^k) = a^i b^i c^k\). Use two stack phases: first pop \(A\)’s vs \(b\)’s (outputting \(b\)), then push \(B\)’s for extra \(b\)’s, then pop \(B\)’s vs \(c\)’s (outputting \(c\)).

4.5. Build a DPDT Translating \(L_4 = \{a^n b^m \mid n \leq m \leq 2n\}\) to \(a^n b^n\) (Lab 7, Task 5)

Build a DPDT that accepts \(L_4 = \{a^n b^m \mid m, n \in \mathbb{N}, n \leq m \leq 2n\}\) and translates it into \(a^n b^n\).

Click to see the solution

Key Concept: The condition \(n \leq m \leq 2n\) means the number of \(b\)’s is between \(n\) and \(2n\). Strategy: push two \(A\)’s per \(a\) on the stack (and output \(a\) once per \(a\)). When reading \(b\)’s, pop one \(A\) per \(b\) and output \(b\) only when pops bring the stack to exactly \(Z_0\) or if the remaining count stays within range.

DPDT for aⁿbᵐ with n ≤ m ≤ 2n, translating to aⁿbⁿ

A simpler approach: push two symbols per \(a\), using a two-level pop. For each \(b\), pop one symbol. If the stack reaches \(Z_0\) after between \(n\) and \(2n\) pops, accept (we’ve consumed enough \(b\)’s). The output is \(a^n b^n\) regardless.

Push phase: For each \(a\), push \(A_1\) and \(A_2\) (two markers). Output \(a\).
Pop phase: For each \(b\), pop one marker. Output \(b\) only for the first \(n\) pops.
Accept when stack is empty (between \(n\) and \(2n\) \(b\)’s consumed).

Simplified transition design:

States: \(q_0\) (reading \(a\)’s), \(q_1\) (reading \(b\)’s, matching first set of \(A\)’s — output \(b\)), \(q_2\) (reading extra \(b\)’s, matching second set — no output), \(q_3\) (accepting)

Use stack symbols \(P\) (primary marker — produces output) and \(S\) (secondary marker — no output). Push \(P\) then \(S\) for each \(a\).

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/SPZ_0, a\)	Push \(P\) (output) and \(S\) (silent); output \(a\)
\(q_0\)	\(q_0\)	\(a, P/SPP, a\)	Push \(S\) then \(P\) on existing \(P\); output \(a\)
\(q_0\)	\(q_0\)	\(a, S/SSP, a\)	Push \(S\) then \(P\) on existing \(S\); output \(a\)
\(q_0\)	\(q_1\)	\(b, P/\varepsilon, b\)	First \(b\) matching primary: pop \(P\), output \(b\)
\(q_1\)	\(q_1\)	\(b, P/\varepsilon, b\)	More \(b\)’s vs \(P\): pop, output \(b\)
\(q_1\)	\(q_2\)	\(b, S/\varepsilon, \varepsilon\)	\(b\) vs secondary: pop \(S\), no output
\(q_2\)	\(q_2\)	\(b, S/\varepsilon, \varepsilon\)	More \(b\)’s vs \(S\): pop, no output
\(q_1\)	\(q_3\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Only \(P\)’s consumed (exactly \(n\) \(b\)’s): accept
\(q_2\)	\(q_3\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	All markers consumed: accept

Answer: \(\tau(a^n b^m) = a^n b^n\) for \(n \leq m \leq 2n\). Push two stack markers per \(a\); pop one per \(b\), outputting \(b\) only for primary markers; accept when stack is empty.

4.6. DPDA Closure Under Complement — Construction Idea (Lecture 7, Example 1)

Explain why the class of DPDA languages is closed under complement, and outline the construction of the complement DPDA.

Click to see the solution

Key Concept: Complement of a DPDA language is computable by (1) making the DPDA acyclic, (2) completing its transition function, and (3) swapping accepting and non-accepting states.

Why naive state-swapping fails: A DPDA on input \(x\) may end in a non-accepting state \(q\) but then take an \(\varepsilon\)-move to reach accepting state \(q'\). If we simply swap states, \(q\) becomes accepting, so the “complement” DPDA also accepts \(x\) — which is wrong (both accept \(x\), but they should have opposite behavior).
Step 1 — Eliminate loops (make acyclic): Transform the DPDA so that after consuming all input, no further \(\varepsilon\)-moves are possible. This ensures the computation halts in a single well-defined state. Every DPDA can be equivalently expressed as an acyclic DPDA (proof omitted; it requires carefully removing \(\varepsilon\)-move cycles without changing the accepted language).
Step 2 — Complete \(\delta\): Add a non-accepting “error” state \(q_{\text{err}}\) and route all undefined transitions there. Now \(\delta\) is a total function: every configuration has exactly one successor.
Step 3 — Swap accepting and non-accepting states: Since the machine is now acyclic and total, it always halts in exactly one state after reading the full input. Swapping \(F\) and \(Q \setminus F\) correctly inverts the acceptance: strings that were accepted are now rejected, and vice versa.
Result: The complement DPDA recognizes \(L^c\) whenever the original DPDA recognizes \(L\). Therefore DPDA is closed under \({}^c\).

Answer: DPDA is closed under complement via the three-step construction: eliminate loops, complete \(\delta\), swap final/non-final states.

4.7. DPDA Closure Under Difference — Proof by Contradiction (Lecture 7, Example 2)

Prove that DPDA is not closed under set difference (\(\setminus\)).

Click to see the solution

Key Concept: Use the identity \(A \cap B = A \setminus B^c\) and the fact that DPDA is closed under complement to derive a contradiction with non-closure under intersection.

Assume for contradiction that DPDA is closed under \(\setminus\).
Use the identity: For any sets \(A, B\): \[A \cap B = A \setminus B^c\]
Apply closure assumptions:
- Since DPDA is closed under complement (proved): \(B \in \mathbf{DPDA} \Rightarrow B^c \in \mathbf{DPDA}\).
- By the assumed closure under difference: \(A \in \mathbf{DPDA}\) and \(B^c \in \mathbf{DPDA} \Rightarrow A \setminus B^c \in \mathbf{DPDA}\).
- Therefore \(A \cap B = A \setminus B^c \in \mathbf{DPDA}\) for any \(A, B \in \mathbf{DPDA}\).
But: We proved that DPDA is not closed under intersection (using \(\{a^n b^n c^m\} \cap \{a^m b^n c^n\} = \{a^n b^n c^n\} \notin \mathbf{DPDA}\)). Contradiction!
Conclusion: The assumption was wrong. DPDA is not closed under \(\setminus\).

Answer: By contradiction using \(A \cap B = A \setminus B^c\) and DPDA’s closure under complement, non-closure under \(\cap\) implies non-closure under \(\setminus\).

4.8. DPDA Closure Under Union — Counterexample (Lecture 7, Example 3)

Show that the class of languages recognized by DPDAs is not closed under union, using \(L_1 = \{a^n b^n \mid n \geq 1\}\) and \(L_2 = \{a^n b^{2n} \mid n \geq 1\}\).

Click to see the solution

Key Concept: Both \(L_1\) and \(L_2\) are individually recognized by DPDAs, but their union cannot be recognized by any DPDA — because a deterministic machine cannot decide which language to prepare for during the push phase.

Verify \(L_1 \in \mathbf{DPDA}\): Build a DPDA that pushes one \(A\) per \(a\), then pops one \(A\) per \(b\). Accepts when \(Z_0\) is on top after all \(b\)’s. Standard construction.
Verify \(L_2 \in \mathbf{DPDA}\): Build a DPDA that pushes two \(A\)’s per \(a\), then pops one \(A\) per \(b\). Accepts when \(Z_0\) is on top.
Argue \(L_1 \cup L_2 \notin \mathbf{DPDA}\):
- A DPDA reading \(a^n\) must decide in its stack preparation whether it will face \(n\) or \(2n\) \(b\)’s.
- After the \(a\)’s are read, the stack is committed. When the \(b\)’s arrive, the machine cannot re-examine the \(a\)-count.
- A deterministic machine cannot simultaneously prepare for both \(n\) \(b\)’s and \(2n\) \(b\)’s.
- A formal proof uses the fact that \(\{a^n b^n c^n\} \notin \mathbf{PDA}\) (via Bar-Hillel Lemma) combined with algebraic manipulation.
Conclusion: \(L_1 \cup L_2 \notin \mathbf{DPDA}\), so DPDA is not closed under \(\cup\).

Answer: DPDA is not closed under union. The union \(\{a^n b^n\} \cup \{a^n b^{2n}\}\) witnesses this, as neither component alone poses a problem but their union does.

4.9. FST — Translating All Strings Over \(\{a,b\}\) (Tutorial 7, Example 1)

Build a Finite State Transducer that accepts any string \(x \in \{a, b\}^*\) and translates it by the rule \(a \mapsto A\), \(b \mapsto B\) (convert lowercase to uppercase).

Click to see the solution

Key Concept: Since every string is accepted, a single accepting state with self-loops suffices.

One-state FST translating a→A and b→B

Define the FST components:
- States: \(Q = \{q_0\}\); initial state \(q_0\); accepting: \(F = \{q_0\}\)
- Input alphabet: \(I = \{a, b\}\); Output alphabet: \(O = \{A, B\}\)
Transitions (both are self-loops on \(q_0\)):
- \(q_0 \xrightarrow{a/A} q_0\): read \(a\), write \(A\)
- \(q_0 \xrightarrow{b/B} q_0\): read \(b\), write \(B\)
Trace for input \(aba\):
- Read \(a\) in \(q_0\): output \(A\), stay in \(q_0\)
- Read \(b\) in \(q_0\): output \(B\), stay in \(q_0\)
- Read \(a\) in \(q_0\): output \(A\), stay in \(q_0\)
- End of input; \(q_0 \in F\) → accepted

Answer: Output \(ABA\). The FST has one state \(q_0\) (accepting) with self-loops \(a/A\) and \(b/B\).

4.10. FST — Translating Only Strings Without \(b\)’s (Tutorial 7, Example 2)

Build a Finite State Transducer that accepts strings \(x \in \{a, b\}^*\) containing no \(b\)’s, and translates them by \(a \mapsto A\), \(b \mapsto B\). Strings containing \(b\) are rejected (translation undefined).

Click to see the solution

Key Concept: Use a sink (non-accepting) state to handle \(b\)’s. Once a \(b\) is seen, the string is rejected regardless of further input.

FST accepting only strings without b’s

States: \(q_0\) (accepting — no \(b\) seen), \(q_1\) (non-accepting sink — \(b\) seen)
Transitions:
- \(q_0 \xrightarrow{a/A} q_0\): read \(a\) in \(q_0\), output \(A\), stay
- \(q_0 \xrightarrow{b/B} q_1\): read \(b\) in \(q_0\), output \(B\), go to sink
- \(q_1 \xrightarrow{a/\varepsilon} q_1\): read \(a\) in sink, output nothing
- \(q_1 \xrightarrow{b/\varepsilon} q_1\): read \(b\) in sink, output nothing
Trace for \(aaa\): stays in \(q_0\), outputs \(AAA\) → accepted, translation = \(AAA\)
Trace for \(aba\): \(q_0 \xrightarrow{a/A} q_0 \xrightarrow{b/B} q_1 \xrightarrow{a/\varepsilon} q_1\) → rejected (translation undefined)

Answer: The FST has states \(\{q_0, q_1\}\) with \(q_0\) accepting. Strings without \(b\)’s are translated to their uppercase version; strings with \(b\)’s are rejected.

4.11. PDT — Translating \(a^n b^n \to A^n B^n\) (Tutorial 7, Example 3)

Build a Pushdown Transducer for \(L = \{a^n b^n \mid n \geq 1\}\) that translates each string by \(a \mapsto A\), \(b \mapsto B\), i.e., \(\tau(a^n b^n) = A^n B^n\).

Click to see the solution

Key Concept: This translation requires a stack because we need to count \(n\) \(a\)’s and then verify \(n\) \(b\)’s. An FST cannot count to an arbitrary \(n\).

PDT translating aⁿbⁿ into AⁿBⁿ

States: \(q_0\) (start), \(q_1\) (reading \(a\)’s), \(q_2\) (reading \(b\)’s), \(q_3\) (accepting)
Stack alphabet: \(\Gamma = \{Z_0, A\}\)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_1\)	\(a, Z_0/AZ_0, A\)	First \(a\): push \(A\), output \(A\)
\(q_1\)	\(q_1\)	\(a, A/AA, A\)	More \(a\)’s: push \(A\), output \(A\)
\(q_1\)	\(q_2\)	\(b, A/\varepsilon, B\)	First \(b\): pop \(A\), output \(B\)
\(q_2\)	\(q_2\)	\(b, A/\varepsilon, B\)	More \(b\)’s: pop \(A\), output \(B\)
\(q_2\)	\(q_3\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Stack empty (only \(Z_0\)): accept

Trace for \(aabb\):
- \(\langle q_0, aabb, Z_0, \varepsilon \rangle \vdash \langle q_1, abb, AZ_0, A \rangle\) (output \(A\))
- \(\langle q_1, abb, AZ_0, A \rangle \vdash \langle q_1, bb, AAZ_0, AA \rangle\) (output \(A\))
- \(\langle q_1, bb, AAZ_0, AA \rangle \vdash \langle q_2, b, AZ_0, AAB \rangle\) (output \(B\))
- \(\langle q_2, b, AZ_0, AAB \rangle \vdash \langle q_2, \varepsilon, Z_0, AABB \rangle\) (output \(B\))
- \(\langle q_2, \varepsilon, Z_0, AABB \rangle \vdash \langle q_3, \varepsilon, Z_0, AABB \rangle\) — ACCEPT

Answer: \(\tau(a^n b^n) = A^n B^n\). The PDT uses a stack to count \(a\)’s, then matches \(b\)’s one-for-one.

4.12. PDT — Translating \(a^n b^n \to b^n\) (Tutorial 7, Example 4)

Build a Deterministic Pushdown Transducer for \(L = \{a^n b^n \mid n > 0\}\) with translation \(\tau(a^n b^n) = b^n\) (output only the \(b\)-part, discarding the \(a\)’s).

Click to see the solution

Key Concept: During the push phase (reading \(a\)’s), output nothing (\(\varepsilon\)). Only during the pop phase (reading \(b\)’s), output \(b\) for each pop.

PDT translating aⁿbⁿ into bⁿ

States: \(q_0\) (reading \(a\)’s), \(q_1\) (reading \(b\)’s), \(q_F\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, \varepsilon\)	First \(a\): push \(A\), NO output
\(q_0\)	\(q_0\)	\(a, A/AA, \varepsilon\)	More \(a\)’s: push \(A\), NO output
\(q_0\)	\(q_1\)	\(b, A/\varepsilon, b\)	First \(b\): pop \(A\), output \(b\)
\(q_1\)	\(q_1\)	\(b, A/\varepsilon, b\)	More \(b\)’s: pop \(A\), output \(b\)
\(q_1\)	\(q_F\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Stack empty: accept, no output

Trace for \(aaabb\) (i.e., \(n = 3\)… but \(aaabb\) has 3 \(a\)’s and 2 \(b\)’s, not in \(L\); use \(aabb\)):
- Push \(A\), push \(A\) (no output), then pop \(A\) (output \(b\)), pop \(A\) (output \(b\)), see \(Z_0\) → accept.
- Output: \(bb = b^2\).

Answer: \(\tau(a^n b^n) = b^n\). The stack counts \(a\)’s; output is generated only in the pop phase.

4.13. PDT — Translating \(a^n b^n \to b^n a^n\) (Tutorial 7, Example 5)

Build a DPDT for \(L = \{a^n b^n \mid n > 0\}\) with translation \(\tau(a^n b^n) = b^n a^n\) (swap the two blocks).

Click to see the solution

Key Concept: Output \(b\) during the push phase (reading \(a\)’s), and output \(a\) during the pop phase (reading \(b\)’s). This “swaps” the outputs.

PDT translating aⁿbⁿ into bⁿaⁿ

States: \(q_0\) (reading \(a\)’s), \(q_1\) (reading \(b\)’s), \(q_F\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_0\)	\(a, Z_0/AZ_0, b\)	First \(a\): push \(A\), output \(b\)
\(q_0\)	\(q_0\)	\(a, A/AA, b\)	More \(a\)’s: push \(A\), output \(b\)
\(q_0\)	\(q_1\)	\(b, A/\varepsilon, a\)	First \(b\): pop \(A\), output \(a\)
\(q_1\)	\(q_1\)	\(b, A/\varepsilon, a\)	More \(b\)’s: pop \(A\), output \(a\)
\(q_1\)	\(q_F\)	\(\varepsilon, Z_0/Z_0, \varepsilon\)	Done: accept

Trace for \(aabb\):
- Read \(a\): output \(b\), push \(A\)
- Read \(a\): output \(b\), push \(A\)
- Read \(b\): output \(a\), pop \(A\)
- Read \(b\): output \(a\), pop \(A\)
- See \(Z_0\) → accept. Output: \(bbaa = b^2 a^2\). ✓

Answer: \(\tau(a^n b^n) = b^n a^n\). Output \(b\)’s during push and \(a\)’s during pop.

4.14. DPDA — Constructing the DPDA for \(L_1 = \{a^n b^n c^m \mid n, m > 0\}\) (Tutorial 7, Example 6)

Build a DPDA that recognizes \(L_1 = \{a^n b^n c^m \mid n, m > 0\}\).

Click to see the solution

Key Concept: Push \(A\)’s for each \(a\), match \(b\)’s against them, then accept any positive number of \(c\)’s (the \(c\)’s don’t affect the stack count).

DPDA for aⁿbⁿcᵐ

States: \(q_0\) (start), \(q_1\) (reading \(a\)’s), \(q_2\) (reading \(b\)’s), \(q_3\) (reading \(c\)’s — accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_1\)	\(a, Z_0/AZ_0\)	First \(a\): push \(A\)
\(q_1\)	\(q_1\)	\(a, A/AA\)	More \(a\)’s: push \(A\)
\(q_1\)	\(q_2\)	\(b, A/\varepsilon\)	First \(b\): pop \(A\)
\(q_2\)	\(q_2\)	\(b, A/\varepsilon\)	More \(b\)’s: pop \(A\)
\(q_2\)	\(q_3\)	\(c, Z_0/Z_0\)	First \(c\) (stack shows \(Z_0\), so \(a\)’s matched \(b\)’s): enter accepting phase
\(q_3\)	\(q_3\)	\(c, Z_0/Z_0\)	More \(c\)’s: stay

Acceptance: \(q_3\) is the accepting state; reached only when \(n\) \(b\)’s exactly matched \(n\) \(a\)’s and at least one \(c\) was seen.

Answer: The DPDA with states \(q_0, q_1, q_2, q_3\) (accepting) as above recognizes \(L_1 = \{a^n b^n c^m \mid n, m > 0\}\).

4.15. DPDA — Constructing the DPDA for \(L_2 = \{a^m b^n c^n \mid n, m > 0\}\) (Tutorial 7, Example 7)

Build a DPDA that recognizes \(L_2 = \{a^m b^n c^n \mid n, m > 0\}\).

Click to see the solution

Key Concept: Skip over the \(a\)’s (they don’t constrain anything), then count \(b\)’s by pushing to the stack, then verify the \(c\)’s by popping.

DPDA for aᵐbⁿcⁿ

States: \(q_0\) (start), \(q_1\) (reading \(a\)’s), \(q_2\) (reading \(b\)’s), \(q_3\) (reading \(c\)’s), \(q_4\) (accepting)

Transitions:

From	To	Label	Meaning
\(q_0\)	\(q_1\)	\(a, Z_0/Z_0\)	First \(a\): no stack change
\(q_1\)	\(q_1\)	\(a, Z_0/Z_0\)	More \(a\)’s: no stack change
\(q_1\)	\(q_2\)	\(b, Z_0/BZ_0\)	First \(b\): push \(B\)
\(q_2\)	\(q_2\)	\(b, B/BB\)	More \(b\)’s: push \(B\)
\(q_2\)	\(q_3\)	\(c, B/\varepsilon\)	First \(c\): pop \(B\)
\(q_3\)	\(q_3\)	\(c, B/\varepsilon\)	More \(c\)’s: pop \(B\)
\(q_3\)	\(q_4\)	\(\varepsilon, Z_0/Z_0\)	Stack has only \(Z_0\): all \(b\)’s matched \(c\)’s → accept

Acceptance: \(q_4\) is accepting, reached via \(\varepsilon\)-move when stack shows \(Z_0\).

Answer: The DPDA with states \(q_0, q_1, q_2, q_3, q_4\) (accepting) as above recognizes \(L_2 = \{a^m b^n c^n \mid n, m > 0\}\).

4.16. DPDA Closure Under Intersection — Counterexample (Tutorial 7, Example 8)

Show that \(L_1 \cap L_2 = \{a^n b^n c^n \mid n > 0\}\) using \(L_1 = \{a^n b^n c^m \mid n,m > 0\} \in \mathbf{DPDA}\) and \(L_2 = \{a^m b^n c^n \mid n,m > 0\} \in \mathbf{DPDA}\), and conclude that DPDA is not closed under intersection.

Click to see the solution

Key Concept: Both \(L_1\) and \(L_2\) are DPDAs (built in previous examples). Their intersection is \(\{a^n b^n c^n\}\), which is not even a CFL (provable via Bar-Hillel).

Compute the intersection: \[L_1 \cap L_2 = \{a^n b^n c^m\} \cap \{a^m b^n c^n\} = \{a^n b^n c^n \mid n > 0\}\] (A string belongs to both iff the \(a\)-count equals the \(b\)-count AND the \(b\)-count equals the \(c\)-count, so all three counts are equal.)
Apply Bar-Hillel Lemma to \(\{a^n b^n c^n\}\): Assume it is a CFL. Let \(m\) be the Bar-Hillel constant. Choose \(w = a^m b^m c^m\). Any decomposition \(w = x_1 x_2 x_3 x_4 x_5\) with \(|x_2 x_3 x_4| \leq m\) spans at most two of the three symbol groups. Pumping up (\(i = 2\)) increases only those two groups, making the counts unequal. Contradiction. So \(\{a^n b^n c^n\} \notin \mathbf{CFL}\), and thus \(\notin \mathbf{DPDA}\).
Conclusion: \(L_1, L_2 \in \mathbf{DPDA}\) but \(L_1 \cap L_2 \notin \mathbf{DPDA}\). Therefore DPDA is not closed under \(\cap\).

Answer: The intersection \(\{a^n b^n c^n\}\) is not a CFL, so it cannot be recognized by any DPDA. Non-closure under intersection is proved.